6 research outputs found

    Policy learning in Continuous-Time Markov Decision Processes using Gaussian Processes

    Continuous-time Markov decision processes provide a powerful mathematical framework for policy-making problems in a wide range of applications, from the control of populations to cyber–physical systems. The key problem for these models is to efficiently compute an optimal policy that controls the system so as to maximise the probability of satisfying a set of temporal logic specifications. Here we introduce a novel method based on statistical model checking and an unbiased estimation of a functional gradient in the space of possible policies. Our approach has several advantages over classical methods based on discretisation techniques: it does not assume a priori knowledge of the model, which can be replaced by a black box, and it does not suffer from state-space explosion. The use of a stochastic moment-based gradient ascent algorithm to guide our search considerably improves the efficiency of learning policies, and the momentum term accelerates convergence. We demonstrate the strong performance of our approach on two examples of non-linear population models: an epidemiology model with no permanent recovery and a queuing system with non-deterministic choice
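
    The momentum idea mentioned in the abstract above can be illustrated with a toy sketch. This is not the paper's method: the paper works with a functional gradient over policies estimated by statistical model checking, whereas the sketch below assumes a single scalar parameter, a hypothetical objective f(theta) = -(theta - 3)^2, and a hypothetical unbiased gradient estimator (the true gradient plus zero-mean Gaussian noise). The names momentum_ascent and noisy_gradient, and all step sizes, are assumptions made for illustration.

```python
import random


def momentum_ascent(grad_estimate, theta0, lr=0.05, beta=0.9, steps=500):
    """Stochastic gradient ascent with a heavy-ball momentum term.

    grad_estimate(theta) must return an unbiased estimate of the gradient.
    """
    theta, velocity = theta0, 0.0
    for _ in range(steps):
        # The velocity accumulates past gradient estimates, which both
        # smooths the noise and speeds up progress along consistent directions.
        velocity = beta * velocity + grad_estimate(theta)
        theta += lr * velocity
    return theta


# Hypothetical toy objective: f(theta) = -(theta - 3)^2, maximised at theta = 3.
rng = random.Random(0)


def noisy_gradient(theta):
    # True gradient plus zero-mean noise, so the estimate is unbiased.
    return -2.0 * (theta - 3.0) + rng.gauss(0.0, 0.1)


theta_hat = momentum_ascent(noisy_gradient, theta0=0.0)
print(theta_hat)
```

    With these assumed settings the iterate settles near the maximiser theta = 3; the residual fluctuation comes from the gradient noise.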

    Approximating values of generalized-reachability stochastic games

    Simple stochastic games are turn-based 2½-player games with a reachability objective. The basic question asks whether one player can ensure reaching a given target with at least a given probability. A natural extension is games with a conjunction of such conditions as objective. Despite a plethora of recent results on the analysis of systems with multiple objectives, the decidability of this basic problem remains open. In this paper, we present an algorithm approximating the Pareto frontier of the achievable values to a given precision. Moreover, it is an anytime algorithm, meaning it can be stopped at any time returning the current approximation and its error bound
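
    The notion of an anytime approximation with an error bound can be sketched for the simpler single-objective case. The code below is not the paper's algorithm, which handles conjunctions of objectives and Pareto frontiers. It assumes one reachability target and a small hypothetical stopping game (every play reaches "goal" or "sink" with probability 1), for which value iteration from below and from above both converge to the game value. The state names, the GAME table, and the function names are illustrative assumptions.

```python
MAX, MIN, RAND = "max", "min", "rand"

# Hypothetical turn-based stochastic game. "goal" and "sink" are absorbing.
GAME = {
    "s0": (MAX, ["r1", "s1"]),                            # Maximizer picks a successor
    "s1": (MIN, ["r2", "r3"]),                            # Minimizer picks a successor
    "r1": (RAND, {"goal": 0.5, "sink": 0.5}),             # random states
    "r2": (RAND, {"goal": 0.7, "sink": 0.3}),
    "r3": (RAND, {"goal": 0.4, "s0": 0.3, "sink": 0.3}),
}


def bellman(game, values):
    """Apply one synchronous Bellman update to every non-absorbing state."""
    new = dict(values)
    for state, (kind, succ) in game.items():
        if kind == MAX:
            new[state] = max(values[t] for t in succ)
        elif kind == MIN:
            new[state] = min(values[t] for t in succ)
        else:
            new[state] = sum(p * values[t] for t, p in succ.items())
    return new


def bounded_value_iteration(game, init, eps):
    """Return (lower, upper) bounds on the reachability value of init.

    The loop can be interrupted at any iteration: the current pair is a
    valid enclosure of the value, and upper - lower is its error bound.
    """
    lower = {s: 0.0 for s in game}
    upper = {s: 1.0 for s in game}
    for bounds in (lower, upper):
        bounds["goal"], bounds["sink"] = 1.0, 0.0
    while upper[init] - lower[init] > eps:
        lower = bellman(game, lower)
        upper = bellman(game, upper)
    return lower[init], upper[init]


lo, hi = bounded_value_iteration(GAME, "s0", 1e-9)
print(lo, hi)
```

    In this toy game the value of s0 solves x = 0.4 + 0.3x, so both bounds close in on 4/7. Without the stopping assumption, iteration from above can get stuck above the true value, so this sketch does not carry over to general games as written.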

    Stochastic Shortest Paths and Weight-Bounded Properties in Markov Decision Processes

    The paper deals with finite-state Markov decision processes (MDPs) with integer weights assigned to each state-action pair. New algorithms are presented to classify end components according to their limiting behavior with respect to the accumulated weights. These algorithms are used to provide solutions for two types of fundamental problems for integer-weighted MDPs. First, a polynomial-time algorithm for the classical stochastic shortest path problem is presented, generalizing known results for special classes of weighted MDPs. Second, qualitative probability constraints for weight-bounded (repeated) reachability conditions are addressed. Among other results, it is shown that the problem of deciding whether a disjunction of weight-bounded reachability conditions holds almost surely under some scheduler belongs to NP ∩ coNP, is solvable in pseudo-polynomial time, and is at least as hard as solving two-player mean-payoff games, while the corresponding problem for universal quantification over schedulers is solvable in polynomial time

    LNCS

    A probabilistic vector addition system with states (pVASS) is a finite state Markov process augmented with non-negative integer counters that can be incremented or decremented during each state transition, blocking any behaviour that would cause a counter to decrease below zero. The pVASS can be used as abstractions of probabilistic programs with many decidable properties. The use of pVASS as abstractions requires the presence of nondeterminism in the model. In this paper, we develop techniques for checking fast termination of pVASS with nondeterminism. That is, for every initial configuration of size n, we consider the worst expected number of transitions needed to reach a configuration with some counter negative (the expected termination time). We show that the problem whether the asymptotic expected termination time is linear is decidable in polynomial time for a certain natural class of pVASS with nondeterminism. Furthermore, we show the following dichotomy: if the asymptotic expected termination time is not linear, then it is at least quadratic, i.e., in Ω(n²)

    Placing unprecedented recent fir growth in a European-wide and Holocene-long context

    Forest decline played a pivotal role in motivating Europe's political focus on sustainability around 35 years ago. Silver fir (Abies alba) exhibited a particularly severe dieback in the mid-1970s, but disentangling biotic from abiotic drivers remained challenging because both spatial and temporal data were lacking. Here, we analyze 14 136 samples from living trees and historical timbers, together with 356 pollen records, to evaluate recent fir growth from a continent-wide and Holocene-long perspective. Land use and climate change influenced forest growth over the past millennium, whereas anthropogenic emissions of acidic sulfates and nitrates became important after about 1850. Pollution control since the 1980s, together with a warmer but not drier climate, has facilitated an unprecedented surge in productivity across Central European fir stands. Restricted fir distribution prior to the Mesolithic and again in the Modern Era, separated by a peak in abundance during the Bronze Age, is indicative of the long-term interplay of changing temperatures, shifts in the hydrological cycle, and human impacts that have shaped forest structure and productivity